geometric shape



ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs

Neural Information Processing Systems

Query embedding (QE)---which aims to embed entities and first-order logic (FOL) queries in low-dimensional spaces---has shown great power in multi-hop reasoning over knowledge graphs. Recently, embedding entities and queries with geometric shapes has become a promising direction, as geometric shapes can naturally represent answer sets of queries and the logical relationships among them. However, existing geometry-based models have difficulty modeling queries with negation, which significantly limits their applicability. To address this challenge, we propose a novel query embedding model, \textbf{Con}e \textbf{E}mbeddings (ConE), the first geometry-based QE model that can handle all FOL operations, including conjunction, disjunction, and negation. Specifically, ConE represents entities and queries as Cartesian products of two-dimensional cones, where the intersection and union of cones naturally model the conjunction and disjunction operations. Further noticing that the closure of the complement of a cone is still a cone, we design geometric complement operators in the embedding space for the negation operation. Experiments demonstrate that ConE significantly outperforms existing state-of-the-art methods on benchmark datasets.
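The closure property the abstract mentions can be checked in a few lines. Below is a minimal sketch, assuming a 2D cone is parameterized by a center angle and an angular aperture (this parameterization and the function names are illustrative assumptions, not ConE's actual implementation):

```python
import math

def in_cone(angle, axis, aperture):
    # True if a direction lies inside the cone centered at `axis`
    # with total angular width `aperture` (0 .. 2*pi).
    d = (angle - axis) % (2 * math.pi)
    if d > math.pi:
        d -= 2 * math.pi
    return abs(d) <= aperture / 2

def complement(axis, aperture):
    # The closure of the complement of a cone is again a cone:
    # opposite axis, aperture 2*pi minus the original aperture.
    return ((axis + math.pi) % (2 * math.pi), 2 * math.pi - aperture)

# A query cone around angle 0 with width pi/2, and its complement cone.
axis, ap = 0.0, math.pi / 2
c_axis, c_ap = complement(axis, ap)
```

A direction inside the original cone falls outside the complement cone (and vice versa), which is exactly what lets a geometric complement operator model negation.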



Calibrating Biased Distribution in VFM-derived Latent Space via Cross-Domain Geometric Consistency

Ma, Yanbiao, Dai, Wei, Liu, Bowei, Chen, Jiayi, Huang, Wenke, Wan, Guancheng, Lu, Zhiwu, Yan, Junchi

arXiv.org Artificial Intelligence

Abstract--Despite the fast progress of deep learning, one standing challenge is the gap between the observed training samples and the underlying true distribution. This gap has multiple causes. In the era of foundation models, we show that when leveraging off-the-shelf (vision) foundation models (e.g., CLIP, DINOv2) for feature extraction, the geometric shapes of the resulting feature distributions exhibit remarkable transferability across domains and datasets. To verify its practical usefulness, we embody our geometric knowledge-guided distribution calibration framework in two popular and challenging settings: federated learning and long-tailed recognition. In the federated setting, we devise a technique for acquiring the global geometric shape under privacy constraints, then leverage this knowledge to generate new samples for clients, with the aim of bridging the gap between local and global observations. In long-tailed learning, it utilizes the geometric knowledge transferred from sample-rich categories to recover the true distribution of sample-scarce tail classes. Comprehensive experiments show that our proposed geometric knowledge-guided distribution calibration effectively overcomes information deficits caused by data heterogeneity and sample imbalance, with boosted performance across benchmarks. The training data relied upon by models is often only a local [6], sparse [7], and biased observation [8] of the underlying ideal global data distribution. This distribution-missing phenomenon manifests in various forms: in federated learning, it appears as label skew and domain skew due to data silos among clients [9], [10], [11], causing a severe misalignment between local data distributions and the global ideal distribution, thereby leading to divergent or even conflicting local optimization directions [12], [13], [14].
In long-tailed recognition, it is characterized by the extreme scarcity of samples in tail classes, preventing the model from capturing the true and complete shape of their distributions [15], [16]. Despite the differing scenarios, the essence is highly unified: models learn from incomplete information, lacking a comprehensive understanding of the overall structure of the real world. Conventional solutions, such as weighting loss functions [7], [17], [18], designing complex regularization terms [9], [14], [19], or aggregation strategies [20], [21], [22], primarily focus on post-hoc compensation at the optimization level. Yanbiao Ma and Zhiwu Lu are with the Gaoling School of Artificial Intelligence, Renmin University of China. Bowen Liu is with Tsinghua University.
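The long-tailed case can be illustrated concretely. Below is a minimal sketch, assuming the "geometric shape" of a class distribution is summarized by the eigenstructure of its feature covariance; the head/tail setup and all numbers are made up for illustration, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample-rich "head" class: its covariance eigenvectors and eigenvalues
# summarize the geometric shape of the feature distribution.
head = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])
cov = np.cov(head, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# Sample-scarce "tail" class: locally, only a prototype (mean) is reliable.
tail_prototype = np.array([10.0, -5.0])

# Recover a fuller tail distribution by reusing the head class's shape:
# draw coefficients along each eigenvector, scaled by its eigenvalue.
n_new = 200
coeffs = rng.normal(size=(n_new, 2)) * np.sqrt(np.clip(eigvals, 0, None))
new_samples = tail_prototype + coeffs @ eigvecs.T
```

The generated samples are centered on the tail prototype but spread along the head class's principal directions, which is the sense in which geometric knowledge is "transferred" from sample-rich to sample-scarce categories.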


Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning

Ma, Yanbiao, Dai, Wei, Huang, Wenke, Chen, Jiayi

arXiv.org Artificial Intelligence

Data heterogeneity in federated learning, characterized by a significant misalignment between local and global distributions, leads to divergent local optimization directions and hinders global model training. Existing studies mainly focus on optimizing local updates or global aggregation, but these indirect approaches demonstrate instability when handling highly heterogeneous data distributions, especially in scenarios where label skew and domain skew coexist. To address this, we propose a geometry-guided data generation method that centers on simulating the global embedding distribution locally. We first introduce the concept of the geometric shape of an embedding distribution and then address the challenge of obtaining global geometric shapes under privacy constraints. Subsequently, we propose GGEUR, which leverages global geometric shapes to guide the generation of new samples, enabling a closer approximation to the ideal global distribution. In single-domain scenarios, we augment samples based on global geometric shapes to enhance model generalization; in multi-domain scenarios, we further employ class prototypes to simulate the global distribution across domains. Extensive experimental results demonstrate that our method significantly enhances the performance of existing approaches in handling highly heterogeneous data, including scenarios with label skew, domain skew, and their coexistence. Code published at: https://github.com/WeiDai-David/2025CVPR_GGEUR
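The privacy-constrained step can be sketched as follows: clients share only distribution summaries (here, covariance matrices), never raw samples, and the aggregated shape then guides local sample generation. The simple averaging rule and all names are illustrative assumptions, not GGEUR's exact protocol:

```python
import numpy as np

rng = np.random.default_rng(1)

def local_shape(samples):
    # Each client computes only summary statistics of its local data.
    return samples.mean(axis=0), np.cov(samples, rowvar=False)

# Two clients observe skewed slices of the same class.
client_a = rng.normal(loc=[0.0, 0.0], scale=[2.0, 0.3], size=(300, 2))
client_b = rng.normal(loc=[0.5, 0.2], scale=[1.8, 0.4], size=(300, 2))

means, covs = zip(*(local_shape(s) for s in (client_a, client_b)))
global_cov = np.mean(covs, axis=0)            # aggregated geometric shape
eigvals, eigvecs = np.linalg.eigh(global_cov)

# A client augments its data along the global principal directions,
# pulling its local distribution toward the (approximate) global one.
local_mean = means[0]
aug = local_mean + (rng.normal(size=(100, 2)) * np.sqrt(eigvals)) @ eigvecs.T
```

Only `global_cov` crosses the client boundary in this sketch, which is what makes the shape transferable under privacy constraints.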



Chain of Images for Intuitively Reasoning

Meng, Fanxu, Yang, Haotong, Wang, Yiding, Zhang, Muhan

arXiv.org Artificial Intelligence

The human brain is naturally equipped to comprehend and interpret visual information rapidly. When confronted with complex problems or concepts, we use flowcharts, sketches, and diagrams to aid our thought process. Leveraging this inherent ability can significantly enhance logical reasoning. However, current Large Language Models (LLMs) do not utilize such visual intuition to help their thinking. Even the most advanced vision-language models (e.g., GPT-4V and LLaVA) merely align images into textual space, which means their reasoning processes remain purely verbal. To mitigate such limitations, we present a Chain of Images (CoI) approach, which can convert complex language reasoning problems into simple pattern-recognition problems by generating a series of images as intermediate representations. Furthermore, we have developed a CoI evaluation dataset encompassing 15 distinct domains where images can intuitively aid problem-solving. Based on this dataset, we aim to construct a benchmark to assess the capability of future multimodal large models to leverage images for reasoning. To support CoI reasoning, we introduce a symbolic multimodal large language model (SyMLLM) that generates images strictly based on language instructions and accepts both text and images as input. Experiments on Geometry, Chess, and Common Sense tasks sourced from the CoI evaluation dataset show that CoI significantly improves performance over pure-language Chain of Thought (CoT) baselines. The code is available at https://github.com/GraphPKU/CoI.


Geometric Projectors: Geometric Constraints based Optimization for Robot Behaviors

Chi, Xuemin, Löw, Tobias, Li, Yiming, Liu, Zhitao, Calinon, Sylvain

arXiv.org Artificial Intelligence

Generating motion for robots that interact with objects of various shapes is a complex challenge, further complicated when the robot's own geometry and multiple desired behaviors are considered. To address this issue, we introduce a new framework based on Geometric Projectors (GeoPro) for constrained optimization. This novel framework allows for the generation of task-agnostic behaviors that are compliant with geometric constraints. GeoPro streamlines the design of behaviors in both task and configuration spaces, offering diverse functionalities such as collision avoidance and goal-reaching, while maintaining high computational efficiency. We validate the efficacy of our work through simulations and Franka Emika robotic experiments, comparing its performance against state-of-the-art methodologies. This comprehensive evaluation highlights GeoPro's versatility in accommodating robots with varying dynamics and precise geometric shapes. For additional materials, please visit: https://www.xueminchi.com/publications/geopro
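The core idea of a geometric projector composed with a desired behavior can be shown on a toy case. This is a minimal sketch under strong simplifying assumptions (a point robot, a spherical keep-out region, a naive goal-reaching step), not GeoPro's actual formulation:

```python
import numpy as np

def project_outside_sphere(x, center, radius):
    # Geometric projector onto the constraint ||x - center|| >= radius:
    # the closest point outside a spherical keep-out region.
    d = x - center
    dist = np.linalg.norm(d)
    if dist >= radius:
        return x
    if dist == 0.0:
        d, dist = np.array([1.0, 0.0]), 1.0  # arbitrary direction at center
    return center + d * (radius / dist)

# Goal-reaching step composed with the projector: the point moves toward
# the goal while being kept out of an obstacle sphere at the origin.
x, goal = np.array([1.5, 0.9]), np.array([-1.5, 0.0])
for _ in range(200):
    x = project_outside_sphere(x + 0.1 * (goal - x), np.zeros(2), 1.0)
```

The iterate slides along the obstacle boundary until the straight path to the goal is clear, illustrating how a behavior (goal-reaching) and a geometric constraint (collision avoidance) can be composed through projection.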


Flexible Multi-DoF Aerial 3D Printing Supported with Automated Optimal Chunking

Stamatopoulos, Marios-Nektarios, Banerjee, Avijit, Nikolakopoulos, George

arXiv.org Artificial Intelligence

The future of 3D printing utilizing unmanned aerial vehicles (UAVs) presents a promising capability to revolutionize manufacturing and to enable the creation of large-scale structures in remote and hard-to-reach areas, e.g., in other planetary systems. Nevertheless, the limited payload capacity of UAVs and the complexity of 3D printing large objects pose significant challenges. In this article we propose a novel chunk-based framework for distributed 3D printing using UAVs that sets the basis for fully collaborative aerial 3D printing of challenging structures. Through a novel optimisation process, the presented framework divides the 3D model to be printed into small, manageable chunks and assigns each to a UAV for partial printing of the assigned chunk, in a fully autonomous approach. Thus, we establish the algorithms for chunk division, allocation, and printing, and we also introduce a novel algorithm that efficiently partitions the mesh into planar chunks while accounting for the inter-connectivity constraints of the chunks. The efficiency of the proposed framework is demonstrated through multiple physics-based simulations in Gazebo, where a CAD construction mesh is printed via multiple UAVs carrying materials whose volume is proportionate to a fraction of the total mesh volume.
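The allocation step can be illustrated with a toy greedy scheme: largest chunks first, each assigned to the UAV with the most remaining material capacity. The chunk volumes, payload figures, and the greedy rule itself are made-up assumptions for illustration, not the paper's optimisation process:

```python
# Volumes of the planar chunks produced by mesh partitioning (made-up numbers).
chunk_volumes = [4.0, 3.0, 2.5, 2.0, 1.5, 1.0]
# Material capacity each UAV can carry (made-up numbers).
payloads = {"uav_1": 6.0, "uav_2": 5.0, "uav_3": 4.0}

loads = {u: 0.0 for u in payloads}
assignment = {u: [] for u in payloads}

# Largest chunks first; each goes to the UAV with the most remaining capacity.
for i, vol in sorted(enumerate(chunk_volumes), key=lambda t: -t[1]):
    uav = max(payloads, key=lambda u: payloads[u] - loads[u])
    assignment[uav].append(i)
    loads[uav] += vol
```

Each UAV ends up carrying material roughly proportionate to its share of the total mesh volume, which is the balance property the framework's allocation stage aims for (a real allocator would also enforce the chunks' inter-connectivity constraints).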


Global Big Data Conference

#artificialintelligence

Researchers from HSE University and Moscow Polytechnic University have found that AI models cannot reproduce key features of human vision because they lack tight coupling with the corresponding physiology, making them worse at recognizing certain images. The results of the study were published in the Proceedings of the Seventh International Congress on Information and Communication Technology. To understand how machine perception of images differs from human perception, the scientists uploaded images of classical visual illusions to the IBM Watson Visual Recognition online service. Most of them were geometric silhouettes partially hidden by geometric shapes of the background color. The system tried to determine the nature of each image and indicated its degree of certainty in the response.